Search results for "Parallel process"
showing 10 items of 34 documents
CUSHAW2-GPU: Empowering Faster Gapped Short-Read Alignment Using GPU Computing
2014
We present CUSHAW2-GPU to accelerate the CUSHAW2 algorithm using compute unified device architecture (CUDA)-enabled GPUs. Two critical GPU computing techniques, namely intertask hybrid CPU-GPU parallelism and tile-based Smith-Waterman map backtracking using CUDA, are investigated to facilitate fast alignments. By aligning both simulated and real reads to the human genome, our aligner yields comparable or better performance compared to BWA-SW, Bowtie2, and GEM. Furthermore, CUSHAW2-GPU with a Tesla K20c GPU achieves significant speedups over the multithreaded CUSHAW2, BWA-SW, Bowtie2, and GEM on the 12 cores of a high-end CPU for both single-end and paired-end alignment.
Exploring parallel capabilities of an innovative numerical method for recovering image velocity vectors field
2010
This paper investigates an efficient method for estimating the velocity vector field. The method is based on a quasi-interpolant operator and involves a large amount of computation. The operations characterizing the computational scheme are ideal for parallel processing because they are local, regular and repetitive. Therefore, the spatial parallelism of the process is studied to speed up the computation on distributed multiprocessor systems. The process has been shown to be synchronous, with good task balancing and a small amount of data transfer.
Practical considerations for acoustic source localization in the IoT era: Platforms, energy efficiency, and performance
2019
The rapid development of the Internet of Things (IoT) has brought important changes in the way emerging acoustic signal processing applications are conceived. While traditional acoustic processing applications have been developed for high-throughput computing platforms equipped with expensive multichannel audio interfaces, the IoT paradigm demands more flexible and energy-efficient systems. In this context, algorithms for source localization and ranging in wireless acoustic sensor networks can be considered an enabling technology for many IoT-based environments, including security, industrial, and health-care applications. This paper is aimed at evaluating impo…
Application based on dynamic reconfiguration of field-programmable gate arrays: JPEG 2000 arithmetic decoder
2005
This paper describes the implementation of a part of the JPEG 2000 algorithm (MQ decoder and arithmetic decoder) on a field-programmable gate array (FPGA) board by using dynamic reconfiguration. A comparison between static and dynamic reconfiguration is presented, and new analysis criteria (spatiotemporal efficiency, logic cost, and performance time) have been defined. The MQ decoder and arithmetic decoder are attractive for dynamic reconfiguration implementation in applications without parallel processing. This implementation is done on an architecture designed to study the dynamic reconfiguration of FPGAs: the ARDOISE architecture. The obtained implementation, based on four partial config…
Parallel laser micromachining based on diffractive optical elements with dispersion compensated femtosecond pulses
2013
We experimentally demonstrate multi-beam high spatial resolution laser micromachining with femtosecond pulses. The effects of chromatic aberration and pulse stretching due to diffraction on the processed material were significantly mitigated by using a suitable dispersion compensated module (DCM). This makes it possible to increase the processing area by a factor of 3 compared with a conventional setup. Specifically, 52 blind holes have been drilled simultaneously onto a stainless steel sample with a 30 fs laser pulse in a parallel processing configuration.
Accelerating short read mapping on an FPGA (abstract only)
2012
The explosive growth of short read datasets produced by high throughput DNA sequencing technologies poses a challenge to the mapping of short reads to a reference genome in terms of sensitivity and execution speed. Existing methods often use a restrictive error model for computing the alignments to improve speed, whereas more flexible error models are generally too slow for large-scale applications. Although a number of short read mapping software tools have been proposed, designs based on hardware are relatively rare. In this paper, we present a hybrid system for short read mapping utilizing both software and field programmable gate array (FPGA)-based hardware. The compute intensive semi-g…
Study of the Audio Susceptibility in Parallel Power Processing With a High-Power Topology
2009
In this paper, the audio susceptibility characteristic of a high-efficiency nonisolated topology that processes only a part of the total power delivery is analyzed. Since the proposed topology presents a direct path from input to output, the effect of the input voltage ripple on the output voltage has been studied. The effect on the audio susceptibility of the values and disposition of the components and the effect of their parasitic elements must be taken into account. As a result of this study, the analytical expression of the audio susceptibility and the design criteria to improve it have been obtained.
Parallel In-Memory Evaluation of Spatial Joins
2019
The spatial join is a popular operation in spatial database systems and its evaluation is a well-studied problem. As main memories become bigger and faster and commodity hardware supports parallel processing, there is a need to revamp classic join algorithms which have been designed for I/O-bound processing. In view of this, we study the in-memory and parallel evaluation of spatial joins, by re-designing a classic partitioning-based algorithm to consider alternative approaches for space partitioning. Our study shows that, compared to a straightforward implementation of the algorithm, our tuning can improve performance significantly. We also show how to select appropriate partitioning parame…
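As a rough illustration of what a partitioning-based spatial join looks like in general, the sketch below assigns axis-aligned rectangles to the cells of a uniform grid and joins each cell independently. It is not the algorithm or the tuning studied in the paper above, whose details are not given in this excerpt. The function names, the rectangle format (x1, y1, x2, y2), and the cell size are all assumptions made for this example.

```python
from collections import defaultdict


def cells(rect, cell):
    """Yield the grid cells (ix, iy) overlapped by a rectangle (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = rect
    for ix in range(int(x1 // cell), int(x2 // cell) + 1):
        for iy in range(int(y1 // cell), int(y2 // cell) + 1):
            yield ix, iy


def intersects(a, b):
    """True if two axis-aligned rectangles overlap (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def grid_spatial_join(R, S, cell=10.0):
    """Return sorted index pairs (i, j) such that R[i] intersects S[j]."""
    # Partition phase: replicate each rectangle into every cell it overlaps.
    buckets_r, buckets_s = defaultdict(list), defaultdict(list)
    for i, r in enumerate(R):
        for c in cells(r, cell):
            buckets_r[c].append(i)
    for j, s in enumerate(S):
        for c in cells(s, cell):
            buckets_s[c].append(j)

    # Join phase: each cell is processed on its own, so in a parallel
    # setting the cells could be handed to separate workers.
    result = set()  # a set removes pairs found in more than one cell
    for c, r_ids in buckets_r.items():
        for i in r_ids:
            for j in buckets_s.get(c, ()):
                if intersects(R[i], S[j]):
                    result.add((i, j))
    return sorted(result)
```

The cell size controls the trade-off this kind of algorithm has to balance: smaller cells mean fewer comparisons per cell but more replication of rectangles that span cell boundaries.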
GekkoFS - A Temporary Distributed File System for HPC Applications
2018
We present GekkoFS, a temporary, highly-scalable burst buffer file system which has been specifically optimized for new access patterns of data-intensive High-Performance Computing (HPC) applications. The file system provides relaxed POSIX semantics, only offering features which are actually required by most (not all) applications. It is able to provide scalable I/O performance and reaches millions of metadata operations already for a small number of nodes, significantly outperforming the capabilities of general-purpose parallel file systems. The work has been funded by the German Research Foundation (DFG) through the ADA-FS project as part of the Priority Programme 1648. It is also support…
The HPC Certification Forum: Toward a Globally Acknowledged HPC Certification
2020
The goal of the HPC Certification Forum is to categorize, define, and examine competencies expected from proficient HPC practitioners. The community-led forum is working toward establishing a globally acknowledged HPC certification process, a process that engages with HPC centers to identify gaps in users’ knowledge, and with users to identify the skills required to perform their tasks. In this article, we introduce the forum and summarize the progress made over the last two years. The release of the first officially supported certificate is planned for the second half of 2020.